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Abstract 

Optimization of ship routing depends on several parameters, like ship and 
cargo characteristics, environmental factors, topography, international nav- 
igation rules, crew comfort etc. The complex nature of the problem leads 
to oversimplifications in analytical techniques, while stochastic methods like 
simulated annealing can be both time consuming and sensitive to local min- 
ima. In this work, a hybrid parallel genetic algorithm - estimation of distri- 
bution algorithm is developed in the island model, to operationally calculate 
the optimal ship routing. The technique, which is applicable not only to 
clusters but to grids as well, is very fast and has been applied to very diffi- 
cult environments, like the Greek seas with thousands of islands and extreme 
micro-climate conditions. 
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1. Introduction 

Optimization of ship routing is closely related to both ship characteris- 
tics and environmental factors and has a significant infiuence on economical, 
safety and comfort considerations. Ship size, speed capabihty and type and 
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scheduling of cargo are important considerations in the route selection pro- 
cess prior to sailing and the surveillance procedure while underway. Ship's 
characteristics identify its vulnerability to adverse conditions and its ability 
to avoid them while cargo type and scheduling identify the safety standards 
that have to be fulfilled and the economical impact of a certain route 

On the other hand, environmental factors of importance to ship routing 
are those elements of the atmosphere and ocean that may produce a change in 
the status of a ship transit. In ship routing, consideration is given to wind, 
waves, fog and ocean currents. While all of the environmental factors are 
important for route selection and surveillance, optimum routing is normally 
considered attained if the effects of wind and waves can be optimized. More 
details about the effect of environmental factors can be found in 

The problem of calculating an optimal or near optimal ship route grows 
very fast in complexity with the inclusion of several realistic constraints, like 
the existence of islands, international or national navigation rules, micro- 
climate and local parameters, etc. Any trial for an efficient analytical so- 
lution soon will be locked in oversimplifications, while heuristics have been 
proved, during the last years, capable to achieve acceptable solutions in such 
comj)licated problems. See for example the Traveling Salesman Problem (j^ 
the Time- Table Problem (|5|), the Quadratic Assi gnm ent Problem 
and j^), the Job Shop Schedu ling Problem ([l^l, ll], 12], l3l and 




the Airline Scheduling Problem ([l5|) and others. The best results 
found for many practical or academic optimization problems are obtained by 
hybrid algorithms. Combination of algorithms such as descent local search 



16l |. simulated annealing [171], tabu search [18| and evolutionary algorithms 
have provided very powerful search algorithms. 

Beyond the complexity of the optimal ship routing problem, one has to 
deal with the necessity to build a system that can respond to user requests 
in a reasonable amount of time. A user may identify himself which means 
that the system has registered information about the user's ship. After that, 
the user gives its departure and arrival locations with the desired departure 
time. The system responds with the optimal route calculated, taking into 
account the user's requirements (for example, small and medium ships are 
more interested in safety and comfort, while storeships are interested in fuel 
consumption and cargo scheduling). In any case, the response should be 
provided immediately (after a few seconds) otherwise, it is very likely that 
users exhibit disinclination in using the system. 

Finally, one has to take special care for the well-known exploration- 
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exploitation trade-off. Exploration is needed to ensure every part of the 
search space is searched thoroughly in order to provide a reliable estimate 
of the global optimum. Exploitation is important since the refinement of 
the current solution will often produce a better solution. Consider the case 
where in a given landscape with mountains and valleys, one wants to calcu- 
late the shortest path between two given points. In order to roughly locate 
this path, one has to observe the landscape from a long distance, so he can 
obtain a general idea of all the possible routes in the landscape. Thus, in our 
example, exploration means to observe from far away. On the other hand, 
after locating the possible optimal route, on has to take a closer look at the 
landscape in order to locate local obstacles and obtain a fine tuning of his 
path. Thus, exploitation in our example means to observe nearby. It is clear 
that exploration and exploitation are two competing tasks. Population-based 



heuristics (where genetic algorithms 19|] and estimation of distribution al 



gorithms 2^ are found) are powerful in the exploration of the search space, 
and weak in the exploitation of the solutions found. 

The integrate solution of the aforementioned difficulties (lots of con- 
straints, immediate response and exploration-exploitation trade off) in the 
optimal ship routing problem is the aim of this work. The material is orga- 
nized as follows: In section [21 a brief review of the optimal ship routing prob- 
lem is given. In section [3l a detailed description of the proposed algorithm 
is given. Section H] summarizes experimental results, while the conclusions of 
the implementation of the new algorithm are presented in section O 



2. The optimal ship routing problem 

Let us assume that an initial route of a ship is represented by a smooth 
curve f(s) (Fig. [1]) where the parameter s is the arc length measured from 
some fixed point A (initial point of the ship route). Then, the tangent vector 
of the curve of the ship's route in the point of question is defined as 

^ f dr , , 

^ = ^ = ^ (1 

\r\ ds ^ ' 

where r is the ship's velocity. 

We also assume that the moving ship is subject to the influence of the 
wave height and direction represented by the vector w and the wind speed 
and direction represented by the vector v. 
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Under the above assumptions we define a route cost (a scalar quantity) 
assigned at every possible route between two points. The route cost includes 
a weighted combination of the voyage time and the safety (or comfort) of the 
voyage. The total cost S is given by 



where T is the total voyage time and C is a scalar characterizing the safety 
(comfort) of the voyage. The weight a can be tuned by the user depending 
on his demands. Note here that when a = 1, then the only optimization 
parameter is the voyage time while when a = 0, the only optimization pa- 
rameter is the crew comfort. The scalar C is calculated as a line integral 
over the route by the following way (up to the linear approximation): 



where v is the wind vector, w is the wave height vector and Z„ and are 
tensors which characterize the ship response to wind and wave, respectively. 
The calculation of the total voyage time T, is a bit more complicated, since 
both wind and waves can alter the speed of the ship. In general we can write 
that 



where u is the speed of the ship in zero wind and F is a function that depends 
on ship characteristics, wind, wave and direction of the ship movement. For 
simplicity in the present work, we assume that F — 0. Moreover, we assume 
that a candidate route can be represented as a set of way-points, while the 
path between two successive way-points is always a straight line (or a great 
circle on the globe). Obviously, the coordinates of the way-points is the 
objective of the search process. Without loss of generahty, we assume that 
the departure and arrival points lay on the horizontal axis (if not, we can 
always rotate the coordinate system). In order to minimize the search space, 
we can assume that the horizontal positions of the way-points are fixed, 
and the objective of the search process is the determination of the vertical 
coordinates of the way-points. This simplification not only is acceptable but 
is indicated from the fact that the environmental parameters who affect the 
optimal route are known in a grid (the grid of the forecasting model used to 
account for the state of the sea during the next hours or days). Therefore, 



S^aT+{l-a)C 



(2) 




(3) 



r(r) \ — u + F(v, w, 



(4) 
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more than one way-points in the neighborhood of a grid point cannot be 
optimized efficiently (since no extra information is available for these points 
from the forecasting model). 



3. The new algorithm 

As it was mentioned before, there are three main difficulties that one 
has to overcome, in order to produce an efficient and operational algorithm 
for the optimal ship routing. In the following, a detailed view of how the 
proposed algorithm handles them is presented. 

3.1. Constraints in ship routing 

Constraints in ship routing arise from the existence of obstacles (islands) 
and general international or national navigation rules. The penalty encoding 
method perhaps is the most popular approach used in Genetic Algorithms 
for constrained optimization problems because of its simplicity and ease of 



implementation (|2l|, 22|, 23|, 2J], 25| and 26|). On the other hand, one can 



take extra actions in order to limit the members of the population in the 
feasible region of the search space. There are several techniques that have 
been proposed, from modified mutation and crossover operation, with the 
property that they only produce feasible offsprings from feasible parents to 
the death penalty method, in which a member is destroyed, if it violates a 
certain constrain. But in the case of the optimal ship routing this method is 
almost unapplicable. The reason is the large numbers of constraints (islands) 
which makes extremely difficult to design mutation and crossover operators 
which produce feasible offsprings. A nice short review of proposed methods 
for handling constraints in genetic algorithms with selected references can be 
found in [27 ]. 

3.1.1. General penalty formalism 

In Fig. [2] a route between two points A and B is shown. This route 
crosses an obstacle and divides it into two parts. Si and S^. In order to 
assign a constrain with the obstacle, we calculate the ratio 

_ min{Si,S2} _ Si 
"-'maxiSi^S,}-'^, 

If the route does not cross the obstacle, then the area of 5*1 is zero and that 
of 5*2 is the area of the whole obstacle. Thus, the parameter h takes the value 
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if the route does not cross the obstacle and a negative value otherwise. In 
the case where the obstacle is divided in exactly two equal areas, h takes the 
value —1, which is the smaller value that can be assigned to h. It is clear 
now, that whatever are the actions of the optimization algorithm, the value 
of h has to be increased. Moreover, this encoding of the constraint shows us 
the direction of the change that have to be induced in the route. As it is 
shown in Fig. [21 the dashed route decreases the value of h, while the dotted 
one increases it. 

Consider now that we have obstacles and for each one of them we 
calculate the parameter h. Then, for a feasible solution the following equation 
must hold: 

hi = Q _ i = i,...,Ar (6) 

On the other hand, there is another type of constraints which are caused 
by a practical inability of a ship to follow abrupt changes in the movement 
direction. If we force the ship to take sharp turns, then this might cause 
safety problems especially in heavy seas. Consider a route which is composed 
by straight lines joining the way points Pq, Pi, Pm, where Pq and Pm is 
the starting and ending point respectively. For each part of the route, we 
calculate the direction vector c/* 

pi pi— 1 

d' = — — -—j,i = l,...,M (7) 
p« _ p«-i 

where P' is the position vector of point Pi. Then, for a feasible solution we 
want that 

/ = cos-' [d^ . t/^+i) - > , A; = 1, M - 1 (8) 

where ipmax is the maximum allowed turn that the ship can take. Thus, for 
a given route x we have two types of constraints, i.e. N equalities h\x) 
and M — 1 inequahties g^{x) (equations ([6]) and ([8]) respectively). Note here 
that a route is represented by a vector containing the M + 1 way points 
Pq, Pi, Pm- If now 6{x) is the Dirac delta function and u{x) is the step 
function with 

= I i ' - n (9) 

^ ^ I , X < ^ ^ 

then the ideal penalty function for a configuration x is given by: 

1 — u(q(x)) 1 , , 

u{g{x)) 5{h{x)) 
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where: 

• the first term is zero, if the inequalities g hold and tends to infinity 
otherwise 

• the second term is zero, if equalities h hold and tends to infinity oth- 
erwise 

3.1.2. Smooth penalty formalism 

Instead of using the discontinuous functions u{x) and S{x), we can ap- 
proach them with the C°° functions 

u^{x) = I 1 - ^"'^ ' ^ < (11) 

^ ^ I 1 ,a; > ^ ^ 



and 



1 . X ^ ^ 



-TT U< 

, -i < X < i (12) 

r-/\| 'a — — a V/ 

Sa{x) ^ 1 , x > i 



a 



where both Ua{x) and Sa{x) tend to u{x) and 6{x) respectively when a ^ oo. 
The penalty function is given now 

1 — Uaiqix)) 1 

P{x) = . + 13 

uMx)) 6b{h{x)) 

for some given a, b. It can be easily shown that both functions ita and 6^^ are 
C°° functions everywhere in R. Furthermore, an obvious advantage of these 
two functions is that although they are smooth, they add no penalty at all 
to feasible solutions. Figure [3] shows the functions Ua and for different 
values of a. 



3.1.3. Complex cost function 

Let us assume now that the problem under consideration can be reduced 
to the minimization of the everywhere positive function S{x) given by equa- 
tion ([2]). Consider the complex function K{x): 

K{x) = S{x) +%■ P{x) (14) 
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Minimization of |A(x)| is now equivalent to our problem. A feasible solution 
to our problem must lie on the real axis (in our case it is the x-configuration 
space). This can be smoothly achieved by considering the generalized cost 
function E given by: 



It is clear now that we restrict the feasible space in the zone < P{x) < j. 
Finally, a sort of annealing is introduced here, pushing A to oo and thus 
moving the solution to the real axis which is the feasible space of the problem. 

3.2. Operational principles 

Having in mind that a route may contains 10 to 20 way points as it 
will be explained later and the search space is the Eastern Mediterranean 
with hundreds of islands, we find that a candidate solution has to fulfill 
hundreds of constraints, which makes the implementation of the algorithm 
to a single computer impractical. Fig. H] shows the application of a typical 
genetic algorithm with 8-bit encoding for each way point, 10 way points for 
each route and the penalty method described earlier. In these experiments, 
the parameter A of equation (ITSl) had the same value as the parameter a of 
equations ffTTj) and (fT2l) . Finally, the experiment was carried on a Pentium® 
4 at 2.AGHz. This experiment tests the quality of the solution depending on 
the annealing rate and thus on the cpu time. It is clear that the algorithm 
is sensitive to the rate of annealing, while for slow annealing, although we 
obtain good solutions, the computational time is inhibitory. 

The use of a parallel system or a grid of computers is thus necessary in 
order to obtain practical response times. Two approaches have been tested. 
The first one is to facilitate the searching mechanism by pre-calculating all 
the possible bypasses of the obstacles between the first and last way points. 
For each one of the pre-calculated routes, we generate a population of routes 
that are close to the initial ones. Each population is evolved using a death 
penalty mechanism for handling the constraints. Finally, the best solution 
is selected. The results of this approach are shown in Fig. The cost 
of the final solution is drawn as a function of time, for several numbers of 




(15) 



where the multiplicative term p is given 




Jm(A) > i 
< /m(A) < 




(16) 
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obstacles between the first and last way point. In Fig. the speedup of 
the implementation of the method in a cluster with 8 nodes is shown. Since 
for every obstacle added between the first and final way point the number of 
possible bypasses are doubled, this method fails for long routes which have 
to bypass tenths of islands. 

The second approach to the problem is based on the synergetic action 
of two algorithms, the genetic algorithm and the estimation of distribution 
algorithm, as it will be explained in details in the following paragraphs. 

3.3. The exploration exploitation trade-off 

The first difficulty which arises in the optimal ship routing is the large 
dimension of the search space. Consider the case where we want to approx- 
imate the optimal solution between two points which lie on the x-axis with 
a set of M way points between the first and the last one. As it was men- 
tioned before, it is reasonable to fix the z-coordinate of the way points and 
try to optimize the ?y-coordinates of the way points. Thus the search space 
is M-dimensional. Since we are using forecasting data for the sea state and 
wind and a typical grid size for such forecasting models is 0.1°, a set of 20 
way points will be adequate to cover every small or medium size ship voyage. 
Moreover, it is reasonable to bound the ^-coordinates of the way points in a 
rectangle with the size of the edge parallel to the x-axis to be the distance 
between the first and last way point and the size of the edge parallel to the 
?/-axis to be double. If we use a binary encoding for every ^/-coordinate with 
n-bits, then the search space is divided in M ■ 2" cells with surface area AS* 
given 

M -2' 

where d is the distance between the first and last way points. The value of 
AS is a measure of the exploration-exploitation of the genetic algorithm. A 
large value for AS* will soon produce a solution which locates in general the 
limits of the ^-coordinates, while a small value for AS" will produce a fine 
tuning of a given route. 

This observation lead us to the construction of several population with 
different size in the bit encoding of the ^/-coordinates. Moreover, population 
are organized in a hierarchical network, in which members of populations with 
/ bits per coordinate in the bit encoding can migrate only to populations with 
/' > I bits per coordinate. The result of this approach is that the populations 
in the upper levels of the network converge fast to solutions (due to the 



= — (17) 
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small number of cells in the search space that they have to look in) and this 
information is forwarded in population of lower levels in the network, where 
fine tuning (exploitation) is performed. 

3.4. Hybrid GA-EDA Algorithms 

The final part of the proposed algorithm is the way that members from 
one population can migrate to another. The hybrid GA-EDA (genetic algo- 
rithm - estimation of distribution algorithm) technique is used. The benefits 
of this approach are (a) the saving of communication time between parallel 
processes and (b) the inclusion of an extra searching mechanism. More specif- 
ically, since members from one population have to migrate in order to carry 
information about the search space, the amount of data that are transferred 
between processes affect dramatically the speedup of the parallel algorithm 
(the communication time is at least four orders of magnitude bigger than the 
processing time) . Thus it is by far more efficient to migrate the distribution 
function of the genes of the members than the members themselves. 

On the other hand, the original objective is to get benefits from both 
approaches. The main difference from these two evolutionary strategies is 
how new individuals (offsprings) are generated. Our new approach generates 
two groups of offspring individuals, one generated by the GA mechanism 
and the other by EDA one. GAs use crossover and mutation operators as a 
mechanism to create new individuals from the best individuals of the previ- 
ous generation. On the other hand, EDA builds a probabilistic model with 
the best individuals and then samples the model to generate new ones. Pop- 
ulation p -|- 1 is composed by the best overall individuals from (i) the past 
population, (ii) the GA-evolved offspring, and (iii) EDA-evolved offspring. 
The individuals are selected based on their fitness function. This evolution- 
ary schema is quite similar to Steady State GA in which individuals from 
one population, with better fitness than new individual from the offspring, 
survive in the next one. 

4. Experimental results 

In order to test the proposed algorithm, the optimal route is calculated 
from the port of Thessaloniki (40.51977V, 22.9709E') to the port of Ag. Niko- 
laos (35.1508Ar, 25.7227E'). Forecast data are taken from climate databases. 
There are more than 20 islands which give more than 2^° ways to bypass 
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them. Fig. shows the cost of the calculated route as a function of com- 
putational time and this is compared to the cost of a route calculated using 
simulated annealing. The cost of the shortest path is also drawn in the same 
figure. The proposed algorithm gives operational results in at least 2 orders 
of magnitude less time than simulated annealing. Finally these results have 
been reproduced several times using variable environmental conditions. 

5. Conclusion 

In this work a hybrid evolutionary method based on genetic algorithms 
and estimation of probability distribution algorithm has been designed and 
implemented to deal with the problem of optimal ship routing. The large 
number of constraints lead us to the development of a parallel system in 
order to produce good solution in reasonable times and thus integrate this 
method in an operational advisory system. The basic parts of the method 
are: 

1. A smooth penalty function has been constructed to deal with con- 
straints which by using an annealing mechanism forces the searching 
to be bounded in the feasible area of the search space. 

2. In order to account with the dependence of the convergence of the algo- 
rithm on the annealing rate, different populations evolve with different 
annealing rates. This, in combination with the exchange of groups of 
members between the populations, unlocks the algorithm from the local 
minima. 

3. In order to handle the exploration exploitation trade-off, different popu- 
lations evolve with different binary encoding, while all of them cover the 
whole searching space. Populations with smaller size in bit representa- 
tion of the solution have a better exploration performance, while those 
with larger size have better exploitation capabilities. 

4. Information between populations are exchanged using the estimation of 
distribution algorithm. This, not only decreases the communication cost 
between the nodes of the parallel system, but includes an extra mutation 
operation which improves the searching capabilities of the algorithm. 

Experimental results both from simulation and real data show that the sys- 
tem meets its specifications and thus could be used in operational mode. 
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Figure 2: A route from point A to point B which crosses an obstacle (soUd route). The 
obstacle is divided by the route into two areas, and S2. The ratio of the smaller area 
to the larger one, is decreased in the right direction (dashed route) while is increased in 
the opposite direction (dotted route). 
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Figure 4: The application of a typical genetic algorithm with 8-bit encoding for each way 
point, 10 way points for each route and the penalty method described in section r3.1.3l In 
these experiments, the parameter A of equation (|15p had the same value as the parameter 
a of equations pT|) and p2|) . Finally, the experiment was carried on a Pentium® 4 at 
2AGHz. The parameter g is the percentage of the increase of A in every generation. 
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Figure 5: Precalculation of obstacle bypasses facilitate the convergence when the number 
of obstacles N is small, but very soon, with increasing A'' the cpu time becomes inhibitory. 
The speedup of this implementation, although close to ideal, cannot solve the problem. 
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Figure 6: The cost of the optimal route from the port of Thessaloniki (40.5197iV, 22.9709£;) 
to the port of Ag. Nikolaos (35.15087V, 25.7227E'), calculated by the proposed GA-EDA 
algorithm (*) compared to the cost of the route calculated using simulated annealing (x). 
The cost of the shortest path is drawn too (solid line). 
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